Modelling Policies in MDPs in Reproducing Kernel Hilbert Space

Authors

  • Guy Lever
  • Ronnie Stafford
Abstract

We consider modelling policies for MDPs in (vector-valued) reproducing kernel Hilbert function spaces (RKHS). This enables us to work “non-parametrically” in a rich function class, and provides the ability to learn complex policies. We present a framework for performing gradient-based policy optimization in the RKHS, deriving the functional gradient of the return for our policy, which has a simple form and can be estimated efficiently. The policy representation naturally focuses on the relevant region of state space defined by the policy trajectories, and does not rely on a priori defined basis points; this can be an advantage in high dimensions, where suitable basis points may be difficult to define a priori. The method is adaptive in the sense that the policy representation will naturally adapt to the complexity of the policy being modelled, which is achieved with standard efficient sparsification tools in an RKHS. We argue that finding a good kernel on states can be easier than remetrizing a high-dimensional feature space. We demonstrate the approach on benchmark domains and a simulated quadrocopter navigation task.
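The idea of a gradient step taken directly in the RKHS can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not stated in the abstract: a Gaussian policy with fixed isotropic noise `sigma`, a scalar RBF kernel, a mean function of the form h(s) = Σ_i K(c_i, s) w_i, and the hypothetical names `rbf_kernel`, `KernelPolicy` and `gradient_step`. Under these assumptions the functional gradient of log π(a|s) with respect to h is K(s, ·)(a − h(s)) / sigma², so each visited state becomes a new kernel centre and the representation follows the policy's own trajectories.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two states (an assumed kernel choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * bandwidth ** 2)))


class KernelPolicy:
    """Gaussian policy whose mean h(s) = sum_i K(c_i, s) * w_i lies in an RKHS."""

    def __init__(self, action_dim, sigma=0.5, bandwidth=1.0):
        self.action_dim = action_dim
        self.sigma = sigma          # fixed exploration noise (assumption)
        self.bandwidth = bandwidth
        self.centres = []           # states at which kernel functions are centred
        self.weights = []           # one action-sized weight vector per centre

    def mean(self, state):
        """Evaluate the RKHS mean function h at a state."""
        h = np.zeros(self.action_dim)
        for c, w in zip(self.centres, self.weights):
            h += rbf_kernel(c, state, self.bandwidth) * w
        return h

    def sample(self, state, rng):
        """Draw an action from N(h(state), sigma^2 I)."""
        return self.mean(state) + self.sigma * rng.standard_normal(self.action_dim)

    def gradient_step(self, trajectory, step_size):
        """One functional-gradient ascent step.

        trajectory: list of (state, action, return_estimate) tuples.
        Each sample contributes step_size * return * K(s, .) (a - h(s)) / sigma^2.
        """
        # Compute all new weights with the current policy before modifying it.
        updates = []
        for state, action, ret in trajectory:
            residual = np.asarray(action, dtype=float) - self.mean(state)
            weight = step_size * ret * residual / self.sigma ** 2
            updates.append((np.asarray(state, dtype=float), weight))
        for centre, weight in updates:
            self.centres.append(centre)
            self.weights.append(weight)


policy = KernelPolicy(action_dim=1, sigma=1.0)
# A single sample with positive return pulls h(s) towards the action taken.
policy.gradient_step([([0.0], [1.0], 2.0)], step_size=0.1)
```

This sketch omits the sparsification step mentioned in the abstract, so the number of centres grows by one per sample; a practical version would compress the representation after each update.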

Similar resources

Reproducing Kernel Space Hilbert Method for Solving Generalized Burgers Equation

In this paper, a new method based on Reproducing Kernel Space (RKS) theory and an iterative algorithm for solving the Generalized Burgers Equation (GBE) are presented. The analytical solution is represented as a series in an RKS, and the approximate solution u(x,t) is constructed by truncating the series. The convergence of u(x,t) to the analytical solution is also proved.

Solving Fuzzy Impulsive Fractional Differential Equations by Reproducing Kernel Hilbert Space Method

The aim of this paper is to use the Reproducing Kernel Hilbert Space Method (RKHSM) to solve linear and nonlinear fuzzy impulsive fractional differential equations. Finding numerical solutions of this class of equations is a difficult problem. In this study, convergence analysis, error estimates and error bounds are discussed in detail under some hypotheses which provi...

The combined reproducing kernel method and Taylor series for solving nonlinear Volterra-Fredholm integro-differential equations

In this letter, a numerical scheme for nonlinear Volterra-Fredholm integro-differential equations is proposed in a reproducing kernel Hilbert space (RKHS). The method is constructed based on the reproducing kernel properties, in which the initial condition of the problem is satisfied. The nonlinear terms are replaced by their Taylor series. In this technique, the nonlinear Volterra-Fredholm integro...

A Note on Solving Prandtl's Integro-Differential Equation

A simple method for solving Prandtl's integro-differential equation is proposed based on a new reproducing kernel space. Using a transformation and modifying the traditional reproducing kernel method, the singular term is removed and the analytical representation of the exact solution is obtained in the form of series in the new reproducing kernel space. Compared with known investigations, its ...

Solving multi-order fractional differential equations by reproducing kernel Hilbert space method

In this paper we propose a relatively new semi-analytical technique to approximate the solution of nonlinear multi-order fractional differential equations (FDEs). We present some results concerning the uniqueness of the solution of nonlinear multi-order FDEs and discuss the existence of a solution for nonlinear multi-order FDEs in a reproducing kernel Hilbert space (RKHS). We further give an error a...

Publication date: 2015